Add support for libvips in addition to ImageMagick #30090

Merged: 10 commits, Jun 5, 2024

Conversation

Member

@Gargron Gargron commented Apr 26, 2024

Libvips can be used instead of ImageMagick with MASTODON_USE_LIBVIPS=true. Enabled by default in the Docker image.

  • Removes metadata
  • Retains color profile
  • Reads pixels for blurhash
  • Reads histograms for color extraction
  • Crops animated GIFs when needed

Fixes MAS-223

@Gargron Gargron added the performance Runtime performance label Apr 26, 2024
@renchap renchap linked an issue Apr 26, 2024 that may be closed by this pull request
Sponsor Member

@renchap renchap left a comment

The GitHub Actions workflows will need to install libvips42 instead of imagemagick.

There are probably some ImageMagick related files to remove, for example config/imagemagick.

The Docker image will also need to be changed to install vips; @vmstan started to work on it. Maybe you can update it in this PR to simply install libvips42 in place of imagemagick, and @vmstan can open another PR later to get it installed better (because the current Debian package enables far too many formats for our usage).

We will also need to take great care in the upgrade instructions to tell people that they need to install VIPS now (and uninstall IM?). This will also be an issue for people running on nightly. The alternative is to support both IM and VIPS for 4.3, with a config option to enable VIPS, show a deprecation warning when IM is used, and remove IM support in 4.4, but I do not know if this is worth it.
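
The "support both for one release" alternative could look roughly like the sketch below. Only the `MASTODON_USE_LIBVIPS` variable comes from this PR; the `image_backend` helper, the exact truthy value it accepts, and the warning text are hypothetical and are not Mastodon's actual code.

```ruby
# Hypothetical helper: choose the image backend from the environment and
# emit a deprecation warning while the ImageMagick path is still in use.
# `env` is any Hash-like object, `out` any object that responds to #puts.
def image_backend(env = ENV, out = $stderr)
  if env['MASTODON_USE_LIBVIPS'] == 'true'
    :vips
  else
    out.puts 'DEPRECATION: ImageMagick support may be removed in a future release; ' \
             'set MASTODON_USE_LIBVIPS=true to use libvips.'
    :imagemagick
  end
end
```

Passing `env` and `out` as arguments keeps the decision easy to exercise in a spec without touching the real environment.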

lib/paperclip/blurhash_transcoder.rb (outdated review thread, resolved)
@renchap
Sponsor Member

renchap commented May 5, 2024

It is also probably a good idea to call Vips.block_untrusted (see libvips/ruby-vips#382) so that only the trusted loaders (i.e., those covered by fuzz testing) are enabled.

Untrusted operations can be seen with vips -l | grep untrusted. There are more details in the 8.13 release notes.
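
A minimal sketch of that call, assuming it runs once at boot. `Vips.block_untrusted(true)` is the ruby-vips method referenced above; the `block_untrusted_loaders` wrapper and its `respond_to?` guard are illustrative additions for bindings or libvips builds that do not expose the call.

```ruby
# Ask libvips to refuse operations flagged as untrusted, when supported.
# `vips` is expected to be the Vips module from the ruby-vips gem.
# Returns true when blocking was requested, false when the method is
# unavailable (ruby-vips only defines it for libvips 8.13 or newer).
def block_untrusted_loaders(vips)
  return false unless vips.respond_to?(:block_untrusted)

  vips.block_untrusted(true)
  true
end

# Typical use in an initializer (requires the ruby-vips gem):
#   require 'vips'
#   block_untrusted_loaders(Vips)
```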

@Gargron Gargron force-pushed the feature-libvips branch 6 times, most recently from 4e72551 to f29cce2 Compare May 6, 2024 03:05
@Gargron Gargron force-pushed the feature-libvips branch 3 times, most recently from 9bd1761 to 79559f3 Compare May 6, 2024 21:28
@Gargron Gargron marked this pull request as ready for review May 7, 2024 00:17
Contributor

github-actions bot commented May 7, 2024

This pull request has merge conflicts that must be resolved before it can be merged.

@shleeable
Contributor

Worth adding libheif?

@vmstan
Sponsor Contributor

vmstan commented May 8, 2024

> Worth adding libheif?

It's included by the libvips42 package.

@vmstan
Sponsor Contributor

vmstan commented May 8, 2024

I’ve been running this PR for the last 20 hours on vmst.io and it’s been largely successful. I have had no user-reported issues uploading and processing image content, but I noticed a few entries in the Sidekiq dead queue triggered by a few remote posts.

extract_area: bad extract area

https://plasmatrap.com/notes/9t1auqkbl0
(There are many entries all related to this note via replies, etc, but this is the source post)

Related backtrace:

/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/operation.rb:228:in `build'
/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/operation.rb:481:in `call'
/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/image.rb:229:in `method_missing'
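
For context, libvips raises "bad extract area" when the requested rectangle is empty or reaches outside the image. The sketch below shows one defensive approach, clamping the rectangle before cropping. It is an assumption about a possible guard, not a diagnosis of what went wrong for this note; `clamp_crop` is a hypothetical helper.

```ruby
# Clamp a crop rectangle so it lies fully inside a width x height image.
# Returns [left, top, crop_width, crop_height], or nil when nothing is left.
def clamp_crop(left, top, crop_width, crop_height, width, height)
  left = left.clamp(0, width)
  top = top.clamp(0, height)
  crop_width = [crop_width, width - left].min
  crop_height = [crop_height, height - top].min
  return nil if crop_width <= 0 || crop_height <= 0

  [left, top, crop_width, crop_height]
end
```

With ruby-vips, a non-nil result could then be splatted into `image.extract_area(*area)`; a nil result means the crop should be skipped.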

hist_find_ndim: image is not 1 - 3 bands

https://graphics.social/users/metin/statuses/112404645439099091
https://mastodon.online/users/spocko/statuses/112401772350190886
https://mastodon.social/users/wearenew_public/statuses/112401409861904050
https://grafana.social/@grafana/112406569119122877

Related backtrace:

/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/operation.rb:228:in `build'
/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/operation.rb:481:in `call'
/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/image.rb:229:in `method_missing'
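
The message indicates that `hist_find_ndim` only accepts images with one to three bands, so an image with more bands (for example RGB plus alpha) would be rejected. A hedged sketch of one workaround follows: keep only the first three bands before building the histogram. Whether dropping the extra bands is the right fix for these posts is an assumption (CMYK images, for instance, would need a colour conversion instead), and `first_three_bands` is a hypothetical helper.

```ruby
# Return an image with at most three bands, suitable for a 1-3 band histogram.
# `image` is expected to behave like a ruby-vips Vips::Image, i.e. respond to
# #bands and #extract_band(index, n:).
def first_three_bands(image)
  return image if image.bands <= 3

  image.extract_band(0, n: 3)
end
```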

VipsForeignSave: "/tmp/c0ddd4cf006863154a6ba8d7b30396bd20240507-7-ivyc3w.jfif" is not a known file format

https://mastodon.art/users/catrionaroberts/statuses/112400639207525684

Related backtrace:

/usr/local/bundle/gems/ruby-vips-2.2.1/lib/vips/image.rb:597:in `write_to_file'
/opt/mastodon/lib/paperclip/lazy_thumbnail.rb:47:in `make'
/usr/local/bundle/gems/kt-paperclip-7.2.2/lib/paperclip/processor.rb:33:in `make'

The JFIF file format error appears to be related to libvips/libvips#3775 which was fixed by a later version of libvips than what comes from the Debian repo. After this error appeared I switched to using the latest version of libvips compiled into the Docker container using vmstan#27 which will be submitted as a change after this PR has merged.
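
For installations that cannot upgrade libvips right away, one possible stopgap is to rewrite the output suffix before saving, since libvips chooses the saver from the filename. This is only a sketch: `save_path_for` and its alias table are hypothetical, not part of Mastodon or libvips.

```ruby
# Map a suffix that an older libvips build may not recognise to an
# equivalent one. The alias table is illustrative only.
def save_path_for(path)
  aliases = { '.jfif' => '.jpg' }
  ext = File.extname(path)
  replacement = aliases[ext.downcase]
  return path unless replacement

  path.delete_suffix(ext) + replacement
end
```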

@vmstan
Sponsor Contributor

vmstan commented May 8, 2024

As far as performance data is concerned, so far I would say libvips has led to slightly lower container CPU utilization. I don't have any fine-grained metrics on image processing performance, but I compared the last day's worth of traffic to the same time period in the previous week, which had a similar amount of Sidekiq traffic processed:

Tuesday 7 into Wednesday 8 May 2024
[Screenshot: CleanShot 2024-05-08 at 09 04 34@2x]

Tuesday 30 April into Wednesday 1 May 2024
[Screenshot: CleanShot 2024-05-08 at 09 05 49@2x]

Average CPU usage (blue line) appears consistently lower, as do all load averages.

These are on three DigitalOcean droplets with four dedicated vCPUs each. Sidekiq processes approximately 1.4 million jobs per day on average.

@vmstan
Sponsor Contributor

vmstan commented May 9, 2024

I've run into an issue where uploading a PNG file that exceeds the allowed dimensions/megapixels results in an error of 422 Validation failed: File must be less than 16 MB, File file size must be less than 16 MB even if the file itself is less than 16 MB.

[Screenshot: CleanShot 2024-05-09 at 16 09 20@2x]

Uploading a file that is exactly 2160x3840 works, but a file that is 2161x3842 will result in the above error. In both cases the file is roughly 4.6 MB in size. So not only is the wrong type of error being triggered, it doesn't seem like any error should be triggered if the file was rescaled by vips.
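
The arithmetic behind the two cases: 2160 × 3840 = 8,294,400 pixels, while 2161 × 3842 = 8,302,562 pixels. The sketch below shows the kind of downscale that would be expected for the larger file, assuming the threshold is a pixel budget of 8,294,400 (inferred from the report above, not confirmed from the code). `fit_within_pixels` is a hypothetical helper, not Mastodon's implementation.

```ruby
# Scale width x height down, keeping the aspect ratio, so that the pixel
# count does not exceed max_pixels. Sizes already within budget are unchanged.
def fit_within_pixels(width, height, max_pixels)
  pixels = width * height
  return [width, height] if pixels <= max_pixels

  scale = Math.sqrt(max_pixels.to_f / pixels)
  [[(width * scale).floor, 1].max, [(height * scale).floor, 1].max]
end
```

Under that assumption, the 2161x3842 upload only needs to shrink by a fraction of a percent, which is why a file-size validation error looks like the wrong failure here.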

I do not see this issue with JPG or HEIF files.

@LeoEurope

> I’ve been running this PR for the last 20 hours on vmst.io and it’s been largely successful.

@vmstan As a switch to libvips supposedly allows WebP and AVIF files to be rendered in link previews (closing #27370 and #14983, and probably many more) do you see that many of the issues listed here are resolved?

@vmstan
Sponsor Contributor

vmstan commented May 10, 2024

>> I’ve been running this PR for the last 20 hours on vmst.io and it’s been largely successful.
>
> @vmstan As a switch to libvips supposedly allows WebP and AVIF files to be rendered in link previews (closing #27370 and #14983, and probably many more) do you see that many of the issues listed here are resolved?

I test-posted all of the links shared in that comment, and they looked the same on my instance running vips as they do on mastodon.online, which runs ImageMagick and a similar build of 4.3.

@LeoEurope

LeoEurope commented May 11, 2024

>>> I’ve been running this PR for the last 20 hours on vmst.io and it’s been largely successful.
>>
>> @vmstan As a switch to libvips supposedly allows WebP and AVIF files to be rendered in link previews (closing #27370 and #14983, and probably many more) do you see that many of the issues listed here are resolved?
>
> I test posted all of the links shared in that comment and they looked the same on my instance running with vips as they do on mastodon.online running ImageMagick and similar build of 4.3.

OK, that's odd. I've already bug-spammed this ticket enough, and this is a libvips implementation ticket, so let's get to the bottom of the issue in ticket #14983 and consider ticket #27370 as a potential stopgap until the system accepts WebP and AVIF. Once libvips has properly landed I can be more helpful with troubleshooting.

@vmstan
Sponsor Contributor

vmstan commented May 13, 2024

CPU usage before and after over a 14 day period.
[Screenshot: CleanShot 2024-05-13 at 09 11 12@2x]

@ClearlyClaire ClearlyClaire changed the title from "Change image processing from ImageMagick to libvips" to "Add support for libvips in addition to ImageMagick" Jun 4, 2024
@ClearlyClaire
Contributor

> does this fix mastodon's handling of image formats that offer both lossless and lossy?
>
> (upload a lossless webp currently and it will re-encode it to q90 lossy, which looks significantly worse)

I don't think this PR changes that significantly, and we want to avoid large file sizes in general. But if you have an example file, I can have a look at what the quality ends up like.

ClearlyClaire previously approved these changes Jun 5, 2024
Contributor

@ClearlyClaire ClearlyClaire left a comment

I am still not 100% sure about the color extraction code, but if my understanding of it is correct, the worst case scenario is rare and just results in suboptimal color selection, not any functional error.

I will continue investigating this, but this should not block a merge.

@ClearlyClaire ClearlyClaire added this pull request to the merge queue Jun 5, 2024
Merged via the queue into main with commit 5f15a89 Jun 5, 2024
61 checks passed
@ClearlyClaire ClearlyClaire deleted the feature-libvips branch June 5, 2024 19:21
@Ember-ruby

>> does this fix mastodon's handling of image formats that offer both lossless and lossy?
>> (upload a lossless webp currently and it will re-encode it to q90 lossy, which looks significantly worse)
>
> I don't think this PR changes that significantly, and we want to avoid large file sizes in general. But if you have an example file, I can have a look at what the quality ends up like.

webp's 90 quality generally has significantly worse compression artifacts than other formats at 90 quality. as mastodon uses the same quality for all formats, this is quite an issue for WEBP
not sure if/how this is different in vips, but from a basic look-through i can't find what quality settings are passed to it, and with some basic CLI testing it seems to exhibit the same issues

(a few test images, encoded with imagemagick, same arguments as GLOBAL_CONVERT_OPTIONS, except where noted. all images 3607x2300px, scaled down from a larger photo to meet the pixel limit.)

this particular scene is probably one of the worst cases for WEBP compression, but it's not exclusive to this case, and it's not that uncommon

https://bin.blobfox.coffee/file/e7WGQe/mastodon-quality-test-image-reference.png
6.8MiB

https://bin.blobfox.coffee/file/dGZK0b/mastodon-quality-test-image-lossless.webp
4.8MiB - lossless WEBP, better compression than PNG

At the moment mastodon will reencode that lossless WEBP into the image below, so if you want to upload something losslessly you will have to upload as a usually less compression efficient PNG, wasting your time, compute, and storage

https://bin.blobfox.coffee/file/dw81re/mastodon-quality-test-image-default.webp
121KiB - what it will be encoded to, quite bad quality, very noticeable artefacts.
(in testing on my instance it produced an even worse quality image than this at ~100KiB)

https://bin.blobfox.coffee/file/ejNxze/mastodon-quality-test-image.jpg
554KiB - quality pretty okay imo, particularly for half a megabyte at ~8.2 Megapixels

https://bin.blobfox.coffee/file/aOo3Ya/mastodon-quality-test-image-jpeg-like.webp
305KiB - this is using the -define webp:emulate-jpeg-size=true option, it's better than default but probably still a bit too low

i would guess that checking which method an image in a codec capable of both lossless and lossy modes uses would reduce media storage usage, due to the above, and increasing the quality of lossy WEBP to be more visually on par with JPEG would help with getting people to use the more efficient codec, rather than being appalled at their image being immensely compressed.

uhh, TLDR: lossless images in lossy codecs should be treated like PNG (WEBP at least stores metadata on whether it is or isn't lossless), and WEBP's quality 90 produces much worse looking images than JPEG quality 90
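
To illustrate the point about WEBP recording whether it is lossless: in the RIFF container, a still image's bitstream lives in a `VP8L` chunk when lossless and a `VP8 ` chunk when lossy. The sketch below walks the top-level chunks to tell them apart. It is a simplified, hypothetical helper (`webp_compression`), not something Mastodon or libvips provides, and it reports animated files as `:unknown` because their frames are nested inside `ANMF` chunks.

```ruby
# Inspect the raw bytes of a WebP file and report how the image is stored.
# Returns :lossless, :lossy, or :unknown (not a WebP, truncated, or animated).
def webp_compression(bytes)
  data = bytes.b
  return :unknown unless data.bytesize >= 12 && data[0, 4] == 'RIFF' && data[8, 4] == 'WEBP'

  offset = 12
  while offset + 8 <= data.bytesize
    fourcc = data[offset, 4]
    size = data[offset + 4, 4].unpack1('V') # little-endian uint32 payload size
    return :lossless if fourcc == 'VP8L'
    return :lossy if fourcc == 'VP8 '

    # Skip this chunk; payloads are padded to an even number of bytes.
    offset += 8 + size + (size.odd? ? 1 : 0)
  end
  :unknown
end
```

A processor could use a check like this to route lossless uploads through the same path as PNG, which is the behaviour being asked for here.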

sorry if this is an excessive reply, lol. /gen

@mjankowski
Contributor

I realize this is a month old and merged already, but curious on the added CI config here -- is the only difference between the newly added VIPS section and the regular spec run -- a) the addition of the env var to opt-in to vips, b) running just the paperclip_processing subset of specs on the vips section?

If so ... I wonder if we could just tack this on to the end of the other runs (in ruby version matrix)? Those specs take ~30s, but we're tacking on another ~1min per ruby version for the setup surrounding them.

@ClearlyClaire
Contributor

Yes, those are about the only changes. I suppose moving them to an additional step in the regular tasks may make sense.

@mjankowski
Contributor

Looked a bit more and realized that the VIPS job also switched the runner to ubuntu-24.04 (the default GH actions ubuntu-latest is still 22.x). I assume this gets us the correct/latest vips packages.

I also thought about this a bit more, and it might not make sense to tack more tasks onto the end of what is already the longest task. Even though we are repeating the Ruby setup, the vips task would be done before the main long rspec run. So tacking it on would reduce the total CI minutes in a given day (fewer Ruby setup runs overall), but not shorten the length of the run for a single PR, if that makes sense.

May re-visit this when GH promotes ubuntu 24 to be the -latest tag.

Labels: performance (Runtime performance)
Projects: none yet
8 participants